Abstract

P.L. Erdos and L.A. Szekely [Adv. Appl. Math. 10(1989), 488-496] gave a
bijection between rooted semilabeled trees and set partitions. L.H. Harper's
results [Ann. Math. Stat. 38(1967), 410-414] on the asymptotic normality of
the Stirling numbers of the second kind translate into asymptotic normality
of rooted semilabeled trees with a given number of vertices, when the number
of internal vertices varies. The Erdos-Szekely bijection specializes to a
bijection between phylogenetic trees and set partitions with classes of
size ≥ 2. We consider modified Stirling numbers of the second kind that
enumerate partitions of a fixed set into a given number of classes of
size ≥ 2, and obtain their asymptotic normality as the number of classes
varies. The Erdos-Szekely bijection translates this result into the
asymptotic normality of the number of phylogenetic trees with a given number
of vertices, when the number of leaves varies. We also obtain asymptotic
normality of the number of phylogenetic trees with a given number of leaves
and varying number of internal vertices, which makes more sense to students
of phylogeny. By the Erdos-Szekely bijection this means the asymptotic
normality of the number of partitions of n + m elements into m classes of
size ≥ 2, when n is fixed and m varies. The proofs are adaptations of the
techniques of L.H. Harper [ibid.]. We provide asymptotics for the relevant
expectations and variances with error term O(1/n).

1 Semilabeled trees and set partitions 

Peter Erdos and Laszlo Szekely [8] enumerated F(n, k), the number of rooted
semilabeled trees with k uniquely labeled leaves and n non-root vertices.
Such trees have a root, which may or may not have degree one and is not
counted as a vertex or a leaf, and have k leaves. Two such trees are
identical if there is a graph isomorphism between them that maps root to
root and every leaf label to the same leaf label. The labels of the leaves
come from the set {1, 2, ..., k} and labels are not repeated.

Erdos and Szekely in [8] established a bijection between the trees counted
by F(n, k) and partitions of an n-element set into n - k + 1 classes, under
which out-degrees of non-root vertices and the root correspond to class
sizes in the partition. The cited result immediately implies that
F(n, k) = S(n, n - k + 1), where S(a, b) denotes the Stirling number of the
second kind that enumerates partitions of an a-element set into b non-empty
classes; and that $\sum_k F(n,k) = \sum_k S(n,k) = B_n$, the Bell number
[18] A000110. Any information available on the Stirling numbers of the
second kind translates into information on the F-numbers. For example, the
recurrence relation

S(n, k) = S(n - 1, k - 1) + kS(n - 1, k) (1) 

translates to F(n, k) = F(n - 1, k) + (n + 1 - k)F(n - 1, k - 1). However,
phylogeneticists are not interested in semilabeled trees with internal
vertices of degree 2 and with root degree 1. We use the term phylogenetic
tree for semilabeled trees that do not fall into these degenerate categories.
Let F*(n, k) denote the number of phylogenetic trees with k leaves and n
non-root vertices, and let S*(n, k) denote the number of partitions of an
n-element set into k classes such that each class contains at least 2
elements. The Erdos-Szekely bijection still provides
F*(n, k) = S*(n, n - k + 1) and S*(n, i) = F*(n, n - i + 1).
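The relations above are easy to check numerically. The following Python sketch is only an illustration: the names `stirling2`, `F` and `check_translated_recurrence` are our own and do not come from the cited works. It builds S(n, k) from recurrence (1), defines F(n, k) as S(n, n - k + 1), and tests the translated recurrence for small n.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """S(n, k) computed from recurrence (1)."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0 or k > n:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


def F(n: int, k: int) -> int:
    """F(n, k) = S(n, n - k + 1), as given by the bijection."""
    return stirling2(n, n - k + 1)


def check_translated_recurrence(n_max: int) -> bool:
    """Check F(n,k) = F(n-1,k) + (n+1-k) F(n-1,k-1) for 2 <= n <= n_max."""
    return all(
        F(n, k) == F(n - 1, k) + (n + 1 - k) * F(n - 1, k - 1)
        for n in range(2, n_max + 1)
        for k in range(1, n + 1)
    )
```

Summing F(n, k) over k = 1, ..., n gives the Bell number of n, in line with the row-sum identity stated above.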

Felsenstein [10, 11], and also Foulds and Robinson [12], investigated the
numbers $T_{n,m}$. Here $T_{n,m}$ is the number of rooted trees with n
labeled leaves and m unlabeled internal vertices (the root is one of them),
where the root has degree at least 2 and no other internal vertex has
degree 2. Clearly

$$T_{n,m} = F^*(n + m - 1, n) = S^*(n + m - 1, m). \tag{2}$$

If we were interested only in evaluating certain $T_{n,m}$ numbers, formula
(2) would suffice. However, the $T_{n,m}$ notation suggests that the
distributions of F(n, k) and F*(n, k) for a large but fixed number of
vertices n and a varying number of leaves k, although mathematically
interesting, are not relevant for phylogenetics. The relevant distribution
for phylogenetics has a large but fixed number of leaves n and a varying
number of internal vertices, so that the total number of vertices varies as
well. Let $t_n = \sum_m T_{n,m}$ denote the number of all phylogenetic trees
with n labeled leaves. This sequence is A000311 in [18], and it is the
solution to Schroeder's fourth problem [11].

This paper proves central and local limit theorems for the arrays S*(n, k)
and $T_{n,k}$, which translate into such results for F*(n, i) and
S*(n - 1 + m, m). We compute the expectations and variances with O(1/n)
error term, to support the phylogeneticists who may use our results to
approximate certain large numbers. The technique to be used is Harper's
method [13], and we heavily exploit far-reaching asymptotic results on Bell
numbers.

2 Harper's method 

Harper [13] gave a very elegant proof for the asymptotic normality of the
array S(n, k). We follow the interpretation of Canfield [2] and Clark [6],
who clarified and explained the details of [13], although our discussion is
somewhat restrictive. Let A(n, j) be an array of non-negative real numbers
for $j = 0, 1, \dots, d_n$, and define $A_n(x) = \sum_j A(n,j)x^j$. Observe
that $\sum_j A(n,j) = A_n(1)$. Let $Z_n$ denote the random variable for which
the probability $\mathcal{P}(Z_n = j) = A(n,j)/A_n(1)$. In terms of $A_n(x)$,
there is a well-known [6] and easy to verify expression for the expectation
and variance of $Z_n$:



$$\mathcal{E}(Z_n) = \frac{A_n'(1)}{A_n(1)}, \qquad \mathcal{D}^2(Z_n) = \frac{A_n'(1)}{A_n(1)} + \left.\left(\frac{A_n'(x)}{A_n(x)}\right)'\right|_{x=1}. \tag{3}$$



As $\mathcal{E}(Z_n)$ and $\mathcal{D}(Z_n)$ are determined by the array
A(n, j), we will also write them as $\mathcal{E}(A(n,\cdot))$ and
$\mathcal{D}(A(n,\cdot))$.
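Formula (3) can be evaluated directly from a coefficient list. The sketch below is an illustration under our own naming (`mean_and_variance` is not from the cited works); it uses the elementary identity $(A'/A)' = A''/A - (A'/A)^2$ and exact rational arithmetic.

```python
from fractions import Fraction


def mean_and_variance(coeffs):
    """Return (E(Z_n), D^2(Z_n)) from coeffs[j] = A(n, j), following (3)."""
    total = sum(coeffs)                                       # A_n(1)
    d1 = sum(j * a for j, a in enumerate(coeffs))             # A_n'(1)
    d2 = sum(j * (j - 1) * a for j, a in enumerate(coeffs))   # A_n''(1)
    mean = Fraction(d1, total)
    # (A'/A)' at x = 1 equals A''(1)/A(1) - (A'(1)/A(1))^2
    variance = mean + Fraction(d2, total) - mean * mean
    return mean, variance
```

For the polynomial $x + 3x^2$, that is for the coefficient list `[0, 1, 3]`, this gives expectation 7/4 and variance 3/16.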

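The array A(n, j) is called asymptotically normal in the sense of a central
limit theorem, if

$$\frac{1}{A_n(1)} \sum_{j \le x_n} A(n,j) \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt \tag{4}$$

as $n \to \infty$ uniformly in x, where

$$x_n = \mathcal{E}(Z_n) + x\,\mathcal{D}(Z_n). \tag{5}$$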

Assume now that all the roots of the polynomial $A_n(x)$ are non-positive
real numbers, say $\{-y_{nk} : k = 1, 2, \dots, d_n\}$. Define the
independent random variables $Y_{nk}$ by
$\mathcal{P}(Y_{nk} = 0) = y_{nk}/(1 + y_{nk})$ and
$\mathcal{P}(Y_{nk} = 1) = 1/(1 + y_{nk})$.

Observe that the probability generating function of the random variable
$Z_n$ is $A_n(x)/A_n(1)$, and the probability generating function of the
random variable $Y_{nk}$ is $(x + y_{nk})/(1 + y_{nk})$. Since the
probability generating function of a sum of independent random variables is
the product of their probability generating functions, the probability
generating function of $\sum_k Y_{nk}$ is
$\prod_{k=1}^{d_n} \frac{x + y_{nk}}{1 + y_{nk}}$. However, as

$$\prod_{k=1}^{d_n} \frac{x + y_{nk}}{1 + y_{nk}} = \frac{A_n(x)}{A_n(1)},$$

we conclude that $Z_n$ and $\sum_k Y_{nk}$ have identical distribution. Let
$G_{nj}(x) = \mathcal{P}\left(\frac{Y_{nj} - \mathcal{E}(Y_{nj})}{\mathcal{D}(Z_n)} \le x\right)$
denote the cumulative distribution function of
$\frac{Y_{nj} - \mathcal{E}(Y_{nj})}{\mathcal{D}(Z_n)}$ for
$j = 1, \dots, d_n$. The Lindeberg-Feller Theorem applies ([7] pp. 98-101) to
the sequence
$\frac{Z_n - \mathcal{E}(Z_n)}{\mathcal{D}(Z_n)} = \sum_j \frac{Y_{nj} - \mathcal{E}(Y_{nj})}{\mathcal{D}(Z_n)}$.
The condition of the cited theorem, for all $\epsilon > 0$

$$\lim_{n\to\infty} \sum_{j=1}^{d_n} \int_{|y| \ge \epsilon} y^2\, dG_{nj}(y) = 0,$$

follows from

$$\lim_{n\to\infty} \mathcal{D}(Z_n) = \infty. \tag{6}$$

Therefore, the cited theorem proves the normal convergence provided (6)
holds and all the roots of the polynomials $A_n(x)$ are non-positive real
numbers.
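The identity of the two distributions can be seen by hand on a toy polynomial (our own illustration, not taken from the cited works). Take $A(x) = x^2 + 3x + 2 = (x+1)(x+2)$, so that $y_1 = 1$, $y_2 = 2$ and $A(1) = 6$:

```latex
\begin{align*}
\mathcal{P}(Z=0) &= \tfrac{2}{6}, &
\mathcal{P}(Z=1) &= \tfrac{3}{6}, &
\mathcal{P}(Z=2) &= \tfrac{1}{6},\\
\mathcal{P}(Y_1=1) &= \tfrac{1}{1+1} = \tfrac12, &
\mathcal{P}(Y_2=1) &= \tfrac{1}{1+2} = \tfrac13, & &\\
\mathcal{P}(Y_1+Y_2=0) &= \tfrac12\cdot\tfrac23 = \tfrac13, &
\mathcal{P}(Y_1+Y_2=1) &= \tfrac12\cdot\tfrac23 + \tfrac12\cdot\tfrac13 = \tfrac12, &
\mathcal{P}(Y_1+Y_2=2) &= \tfrac12\cdot\tfrac13 = \tfrac16.
\end{align*}
```

The two rows of probabilities agree, as the product formula for the generating functions predicts.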

A sequence is called unimodal if it first increases and then decreases.
An array A(n, k) is called unimodal if, for every n, the sequence
$a_k = A(n,k)$ is such. A sequence $a_k$, which is 0 for $k < t$ and
$\ell < k$, with $a_t \ne 0$ and $a_\ell \ne 0$, is called strictly
log-concave (SLC) if $a_k^2 - a_{k-1}a_{k+1} > 0$ for
$t + 1 \le k \le \ell - 1$. An array A(n, k) is called strictly log-concave
(SLC) if, for every fixed n, the sequence $a_k = A(n,k)$ is such. It is
well-known and easy to see that any SLC sequence is unimodal in the variable
k. Using Newton's Inequality, Lieb [14] showed that if a polynomial
$\sum_{k=1}^{N} c_k x^k$ has only real roots, then for $k = 2, \dots, N-1$

$$c_k^2 \ge c_{k+1}c_{k-1}\left(\frac{k}{k-1}\right)\left(\frac{N-k+1}{N-k}\right), \tag{7}$$

and hence the sequence is SLC, and showed the SLC property of S(n, k)
through (7).
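Both the SLC property and inequality (7) can be tested on rows of the Stirling array. The code below is a small sketch with helper names of our own choosing (`stirling2_row`, `is_slc`, `newton_bound_holds`); it assumes the row has no internal zeros, which holds for S(n, k).

```python
from fractions import Fraction


def stirling2_row(n):
    """Return [S(n,0), ..., S(n,n)] using S(n,k) = S(n-1,k-1) + k S(n-1,k)."""
    row = [1]
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            prev_k = row[k] if k < m else 0
            new[k] = row[k - 1] + k * prev_k
        row = new
    return row


def is_slc(seq):
    """Strict log-concavity on the support of seq (no internal zeros assumed)."""
    support = [a for a in seq if a != 0]
    return all(
        support[k] ** 2 > support[k - 1] * support[k + 1]
        for k in range(1, len(support) - 1)
    )


def newton_bound_holds(c):
    """Check (7) for the polynomial sum_{k=1}^{N} c[k] x^k, where c[0] == 0."""
    N = len(c) - 1
    return all(
        c[k] ** 2
        >= c[k + 1] * c[k - 1] * Fraction(k, k - 1) * Fraction(N - k + 1, N - k)
        for k in range(2, N)
    )
```

For n = 4 the row is 0, 1, 7, 6, 1, and (7) reads 49 ≥ 18 for k = 2 and 36 ≥ 21 for k = 3.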

E.R. Canfield [2] noted that for asymptotically normal sequences (4), the
SLC property and $\mathcal{D}(Z_n) \to \infty$ imply the following local
limit theorem:

$$\lim_{n\to\infty} \frac{\mathcal{D}(Z_n)}{A_n(1)}\, A(n, \lfloor x_n \rfloor) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \tag{8}$$

uniformly in x. Furthermore, from the fact that the convergence of the
A(n, j) numbers to the Gaussian function is actually uniform, he concluded
that the number $k = J_n$ maximizing A(n, k) satisfies

$$J_n - \mathcal{E}(Z_n) = o(\mathcal{D}(Z_n)) \tag{9}$$

and

$$A(n, J_n) \sim \frac{A_n(1)}{\sqrt{2\pi}\,\mathcal{D}(Z_n)}. \tag{10}$$
For the Stirling numbers of the second kind, A(n, j) = S(n, j),
$A_n(1) = B_n$, and one has

$$\mathcal{E}(S(n,\cdot)) = \frac{B_{n+1}}{B_n} - 1, \tag{11}$$

$$\mathcal{D}^2(S(n,\cdot)) = \frac{B_{n+2}}{B_n} - \left(\frac{B_{n+1}}{B_n}\right)^2 - 1. \tag{12}$$

Harper [13] showed that $\sum_k S(n,k)x^k$ has distinct non-positive roots
and that (12) goes to infinity, which is sufficient for the asymptotic
normality of the Stirling numbers of the second kind. Harper [13] already
observed ([5]) for A(n, k) = S(n, k).

The SLC property of S(n, k) implies the SLC property and unimodality
of F(n, k). Consequently, the F(n, k) array is also asymptotically normal,
in the sense of both the central and local limit theorems, with
$\mathcal{E}(F(n,\cdot)) = n + 1 - \mathcal{E}(S(n,\cdot))$ and
$\mathcal{D}(F(n,\cdot)) = \mathcal{D}(S(n,\cdot))$.
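The Bell-number expressions for the expectation $B_{n+1}/B_n - 1$ and the variance $B_{n+2}/B_n - (B_{n+1}/B_n)^2 - 1$ of the Stirling distribution can be compared with a direct computation from a row of the array. This is a minimal sketch; the function names are ours and serve only the illustration.

```python
from fractions import Fraction


def stirling2_row(n):
    """Return [S(n,0), ..., S(n,n)] using S(n,k) = S(n-1,k-1) + k S(n-1,k)."""
    row = [1]
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            prev_k = row[k] if k < m else 0
            new[k] = row[k - 1] + k * prev_k
        row = new
    return row


def bell(n):
    """Bell number B_n as the row sum of the Stirling numbers."""
    return sum(stirling2_row(n))


def moments_from_bell(n):
    """Expectation and variance of S(n, .) from ratios of Bell numbers."""
    ratio1 = Fraction(bell(n + 1), bell(n))
    ratio2 = Fraction(bell(n + 2), bell(n))
    return ratio1 - 1, ratio2 - ratio1 ** 2 - 1


def moments_direct(n):
    """Expectation and variance of S(n, .) computed from the row itself."""
    row = stirling2_row(n)
    total = sum(row)
    mean = Fraction(sum(k * a for k, a in enumerate(row)), total)
    second = Fraction(sum(k * k * a for k, a in enumerate(row)), total)
    return mean, second - mean * mean
```

For n = 4 both routes give expectation 37/15 and variance 116/225.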



3 Asymptotics for Bell numbers 

An asymptotic formula for the Bell numbers, in terms of the unique real
solution r of the equation $re^r = n$, was obtained by Moser and Wyman [15]:

$$B_n \sim (r+1)^{-1/2} \exp\left[n\left(r + \frac{1}{r} - 1\right) - 1\right]\left(1 - \frac{r^2(2r^2 + 7r + 10)}{24n(r+1)^3}\right).$$

Iteration easily gives r = r(n) = ln n - ln ln n + O(1). The function r(n)
is also known as LambertW(n). The explicit form of their result is not
convenient for obtaining asymptotics for the expectation and the variance,
as r varies with n. Canfield and Harper [5] and Canfield [3] made minor
modifications to the proof of Moser and Wyman [15] to develop an estimate
for $B_{n+h}$, which holds uniformly for h = O(ln n), using a single
r = r(n) value, as $n \to \infty$:

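$$B_{n+h} = \frac{(n+h)!}{r^{n+h}}\, \frac{e^{e^r - 1}}{(2\pi B)^{1/2}} \left(1 + \frac{P_0 + hP_1 + h^2P_2}{e^r} + \frac{Q_0 + hQ_1 + h^2Q_2 + h^3Q_3 + h^4Q_4}{e^{2r}} + O(e^{-3r})\right), \tag{13}$$

where $B = (r^2 + r)e^r$, and $P_i$ and $Q_i$ are explicitly known rational
functions of r.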
We list and use in the Maple worksheet [19] their exact values from Canfield
[3]. Using those, the formula (13) immediately provides asymptotics for
$\mathcal{E}(S(n,\cdot))$ and $\mathcal{D}^2(S(n,\cdot))$, as in [3] (note
that [3] only claimed O(r/n) error term in (15)):

$$\mathcal{E}(S(n,\cdot)) = \frac{n}{r} - 1 + \frac{r}{2(r+1)^2} + O\left(\frac{1}{n}\right), \tag{14}$$

$$\mathcal{D}^2(S(n,\cdot)) = \frac{n}{r(r+1)} + \frac{r(r-1)}{2(r+1)^4} - 1 + O\left(\frac{1}{n}\right). \tag{15}$$
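A quick numerical comparison illustrates how close the approximations $n/r - 1 + r/(2(r+1)^2)$ for the expectation and $n/(r(r+1)) + r(r-1)/(2(r+1)^4) - 1$ for the variance are to the exact values obtained from Bell numbers. The sketch below is our own illustration: the Newton iteration for $re^r = n$, its starting point and the function names are assumptions made for the example, not part of the cited worksheets.

```python
import math
from fractions import Fraction


def lambert_r(n, iterations=60):
    """Solve r * exp(r) = n for r > 0 by Newton's method."""
    r = math.log(n + 1.0)  # starting point to the right of the root
    for _ in range(iterations):
        f = r * math.exp(r) - n
        r -= f / (math.exp(r) * (r + 1.0))
    return r


def bell_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] using the Bell triangle."""
    bells = [1]
    row = [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells


def exact_moments(n):
    """Exact expectation and variance of S(n, .) from Bell-number ratios."""
    bells = bell_numbers(n + 2)
    ratio1 = Fraction(bells[n + 1], bells[n])
    ratio2 = Fraction(bells[n + 2], bells[n])
    return float(ratio1 - 1), float(ratio2 - ratio1 ** 2 - 1)


def approximate_moments(n):
    """Approximations in terms of r, with r * exp(r) = n."""
    r = lambert_r(n)
    mean = n / r - 1 + r / (2 * (r + 1) ** 2)
    variance = n / (r * (r + 1)) + r * (r - 1) / (2 * (r + 1) ** 4) - 1
    return mean, variance
```

Calling `exact_moments(n)` and `approximate_moments(n)` for the same n shows the two pairs agreeing closely already for moderate n.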



With symbolic calculations Salvy and Shackell [16] obtained the following
asymptotics just in terms of n, with a compromise at the error term:

$$\mathcal{E}(S(n,\cdot)) = \frac{n}{\ln n} + \frac{n(\ln\ln n + O(1/\ln n))}{\ln^2 n}, \tag{16}$$

$$\mathcal{D}^2(S(n,\cdot)) = \frac{n}{\ln^2 n} + \frac{n(2\ln\ln n - 1 + O(1/\ln n))}{\ln^3 n}. \tag{17}$$

4 Phylogenetic trees and set partitions without 
singletons 

Theorem 4.1. For the sequence A(n, j) = S*(n, j) the central limit theorem
(4) and the local limit theorem (8) hold with

n 1 1 1 

£(S*(n,.)) = ~-r + — ^lO-), (18) 



r(r + l) r + 1 2(r + l) 2 

111 
2( r + i)3 + ( r + l)4 +0 (n^' (19) 

Furthermore, the number $k = J_n$ that maximizes S*(n, k) satisfies

$$J_n = \frac{n}{r} + o\left(\frac{\sqrt{n}}{r}\right) \tag{20}$$

and

$$S^*(n, J_n) = \frac{r B_{n-1}}{\sqrt{2n\pi}}\,(1 + o(1)). \tag{21}$$

It is remarkable that making an asymptotic expansion in terms of r in
(18), (19), after a few terms the error reduces to O(1/n), as in the case of
the Bell numbers in (14), (15). Using these asymptotic expansions we obtain
that $\mathcal{E}(S^*(n,\cdot)) - \mathcal{E}(S(n,\cdot)) = O(r)$ and
$\mathcal{D}^2(S^*(n,\cdot)) - \mathcal{D}^2(S(n,\cdot)) = O(r)$.
Statement (i) below follows from these remarkably small differences.
Corollary 4.2. (i) (16) and (17) still hold when S(n, .) is changed to
S*(n, .).

(ii) A(n, k) = F*(n, k) satisfies (4) and (8) with
$\mathcal{E}(F^*(n,\cdot)) = n + 1 - \mathcal{E}(S^*(n,\cdot)) = n - n/r + r + 1 + o(1/r)$
and $\mathcal{D}(F^*(n,\cdot)) = \mathcal{D}(S^*(n,\cdot))$.

Proof of Theorem 4.1. We start with some facts that we need. Set
$B_n^* = \sum_k S^*(n,k)$, the number of all partitions of an n-element set
not using singleton classes [18] A000296. Becker [1] observed that¹

$$B_n = B_{n+1}^* + B_n^*. \tag{22}$$

From $B_i = B_i^* + B_{i+1}^*$ for $i = 1, 2, \dots, n$, and $B_1^* = 0$, we
obtain $B_{n+1}^* = \sum_{i=1}^{n} B_i(-1)^{n-i}$. As the $B_n$ sequence is
strictly increasing, we immediately obtain
$B_t - B_{t-1} < B_{t+1}^* = \sum_{i=1}^{t} B_i(-1)^{t-i} < B_t$ for
$t \ge 3$, and with t = n - h the asymptotic formula

$$B_{n+1}^* = B_n - B_{n-1} + \dots + (-1)^h B_{n-h} + O(B_{n-h-1}). \tag{23}$$

In the special case h = 0, using (13), we obtain:

$$B_{n+1}^* = B_n - O(B_{n-1}) = B_n\left(1 - O\left(\frac{\ln n}{n}\right)\right). \tag{24}$$

(It turns out, as a byproduct, that almost all set partitions contain a
singleton.) We obtain the recurrence relation

$$S^*(n,k) = (n-1)S^*(n-2,k-1) + kS^*(n-1,k), \tag{25}$$

according to the case analysis whether the nth element is in a doubleton
class or not. We define the polynomial sequence
$S_n(x) = \sum_k S^*(n,k)x^k$. It is easy to see that $S_1(x) = 0$,
$S_2(x) = x$, and for $n \ge 3$ from (25),

$$S_n(x) = (n-1)xS_{n-2}(x) + xS_{n-1}'(x). \tag{26}$$
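The recurrence for S*(n, k) and Becker's identity $B_n = B_{n+1}^* + B_n^*$ are easy to test for small n. In the sketch below the helper names are ours, and we add the convention $S_0(x) = 1$ (the empty partition) so that the recurrence can be started at n = 2; this convention is an assumption of the example.

```python
def s_star_polys(n_max):
    """Return polys with polys[n][k] = S*(n, k) for 0 <= n <= n_max.

    Uses S*(n,k) = (n-1) S*(n-2,k-1) + k S*(n-1,k), started from
    S_0(x) = 1 (our convention) and S_1(x) = 0.
    """
    polys = [[1], [0]]
    for n in range(2, n_max + 1):
        size = n // 2 + 1
        row = [0] * size
        for k in range(1, size):
            a = polys[n - 2][k - 1] if k - 1 < len(polys[n - 2]) else 0
            b = polys[n - 1][k] if k < len(polys[n - 1]) else 0
            row[k] = (n - 1) * a + k * b
        polys.append(row)
    return polys


def bell_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] using the Bell triangle."""
    bells = [1]
    row = [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells


def becker_identity_holds(n_max):
    """Check B_n = B*_{n+1} + B*_n for 1 <= n <= n_max."""
    b_star = [sum(p) for p in s_star_polys(n_max + 1)]
    bells = bell_numbers(n_max)
    return all(bells[n] == b_star[n + 1] + b_star[n] for n in range(1, n_max + 1))
```

The first rows are $S_4(x) = 3x^2 + x$, $S_5(x) = 10x^2 + x$ and $S_6(x) = 15x^3 + 25x^2 + x$, with row sums 4, 11 and 41.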

For the proof, first we compute $\mathcal{E}(S^*(n,\cdot))$ and
$\mathcal{D}^2(S^*(n,\cdot))$ exactly and then asymptotically. The central
and local limit theorems hinge on $\mathcal{D}(S^*(n,\cdot)) \to \infty$.
Formulae (20) and (21) follow from (9) and (10), where $B_n^*$ is
approximated with $B_{n-1}$ by (24). Finally, Lemma 4.3 will provide the
non-positive real roots of the generating polynomial.



¹ Identity (22) can be proved by the following bijection from the partitions
with at least one singleton class of an n-element set, [n], to the partitions
without singleton classes of an (n + 1)-element set, [n + 1]: build a new
class from the elements of all singletons and n + 1.
We obtain from (3), using (26) repeatedly,

$$\mathcal{E}(S^*(n,\cdot)) = \frac{B_{n+1}^*}{B_n^*} - n\,\frac{B_{n-1}^*}{B_n^*},$$

$$\mathcal{D}^2(S^*(n,\cdot)) = \frac{B_{n+2}^*}{B_n^*} + 2n\,\frac{B_{n+1}^*B_{n-1}^*}{(B_n^*)^2} + n(n-1)\,\frac{B_{n-2}^*}{B_n^*} - \left(\frac{B_{n+1}^*}{B_n^*}\right)^2 - n^2\left(\frac{B_{n-1}^*}{B_n^*}\right)^2 - n\,\frac{B_{n-1}^*}{B_n^*} - (2n+1).$$

To obtain (18) and (19), we started with the closed forms above, used (23)
for the B* numbers, substituted the B numbers with (13), and changed
$e^{-r}$ to r/n. For details, see the Maple worksheet [19].
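The exact expressions can be cross-checked independently of the Maple worksheet. Differentiating $S_{n+1}(x) = nxS_{n-1}(x) + xS_n'(x)$ at x = 1 gives $S_n'(1) = B_{n+1}^* - nB_{n-1}^*$ and, after one more differentiation, $S_n''(1) = B_{n+2}^* - B_{n+1}^* - (2n+1)B_n^* + n(n-1)B_{n-2}^*$. The Python sketch below (our own naming, with the convention $B_0^* = 1$ as an assumption) compares the resulting expectation and variance with a direct computation from the coefficients.

```python
from fractions import Fraction


def s_star_polys(n_max):
    """polys[n][k] = S*(n, k), from S*(n,k) = (n-1) S*(n-2,k-1) + k S*(n-1,k)."""
    polys = [[1], [0]]  # S_0(x) = 1 is our convention, S_1(x) = 0
    for n in range(2, n_max + 1):
        size = n // 2 + 1
        row = [0] * size
        for k in range(1, size):
            a = polys[n - 2][k - 1] if k - 1 < len(polys[n - 2]) else 0
            b = polys[n - 1][k] if k < len(polys[n - 1]) else 0
            row[k] = (n - 1) * a + k * b
        polys.append(row)
    return polys


def moments_closed_form(n):
    """Expectation and variance of S*(n, .) expressed through the B* numbers."""
    b = [sum(p) for p in s_star_polys(n + 2)]
    bn = b[n]
    mean = Fraction(b[n + 1] - n * b[n - 1], bn)
    variance = (
        Fraction(b[n + 2], bn)
        + 2 * n * Fraction(b[n + 1] * b[n - 1], bn * bn)
        + n * (n - 1) * Fraction(b[n - 2], bn)
        - Fraction(b[n + 1], bn) ** 2
        - n * n * Fraction(b[n - 1], bn) ** 2
        - n * Fraction(b[n - 1], bn)
        - (2 * n + 1)
    )
    return mean, variance


def moments_direct(n):
    """Expectation and variance of S*(n, .) computed from the coefficients."""
    row = s_star_polys(n)[n]
    total = sum(row)
    mean = Fraction(sum(k * a for k, a in enumerate(row)), total)
    second = Fraction(sum(k * k * a for k, a in enumerate(row)), total)
    return mean, second - mean * mean
```

For n = 4 both functions return expectation 7/4 and variance 3/16.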

Induction immediately gives from (26) that for $n \ge 2$

$$\deg(S_n(x)) = \left\lfloor \frac{n}{2} \right\rfloor \tag{27}$$

and the root x = 0 has multiplicity one. Hence $S_n'(0) > 0$ for $n \ge 2$.

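Lemma 4.3. Apart from x = 0, the roots of $S_{2n}(x)$ and $S_{2n+1}(x)$ are
negative real numbers and every root occurs with multiplicity one.
Furthermore, if the roots of $S_{2n}(x)$ are denoted by $\beta_i^{(2n)}$ in
increasing order, and the roots of $S_{2n-1}(x)$, $S_{2n+1}(x)$ are denoted
by $\alpha_i^{(2n-1)}$, $\alpha_i^{(2n+1)}$, both in increasing order, then
the following interlacing properties hold:

$$\beta_1^{(2n)} < \alpha_1^{(2n-1)} < \beta_2^{(2n)} < \alpha_2^{(2n-1)} < \dots < \beta_{n-1}^{(2n)} < \alpha_{n-1}^{(2n-1)} = 0 = \beta_n^{(2n)},$$

$$\beta_1^{(2n)} < \alpha_1^{(2n+1)} < \beta_2^{(2n)} < \alpha_2^{(2n+1)} < \dots < \alpha_{n-2}^{(2n+1)} < \beta_{n-1}^{(2n)} < \alpha_{n-1}^{(2n+1)} < \beta_n^{(2n)} = 0 = \alpha_n^{(2n+1)}.$$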
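As a worked illustration of the interlacing (our own computation, with roots rounded to four decimals), take n = 3. The recurrence $S_n(x) = (n-1)xS_{n-2}(x) + xS_{n-1}'(x)$ gives the three polynomials below, and their roots interlace as the lemma states:

```latex
\begin{align*}
S_5(x) &= 10x^2 + x, & \alpha^{(5)} &: \; -0.1,\; 0,\\
S_6(x) &= 15x^3 + 25x^2 + x, & \beta^{(6)} &: \; -1.6257,\; -0.0410,\; 0,\\
S_7(x) &= 105x^3 + 56x^2 + x, & \alpha^{(7)} &: \; -0.5148,\; -0.0185,\; 0,\\[4pt]
&\beta_1^{(6)} < \alpha_1^{(5)} < \beta_2^{(6)} < \alpha_2^{(5)} = 0 = \beta_3^{(6)}
  &&: \; -1.6257 < -0.1 < -0.0410 < 0,\\
&\beta_1^{(6)} < \alpha_1^{(7)} < \beta_2^{(6)} < \alpha_2^{(7)} < \beta_3^{(6)} = 0 = \alpha_3^{(7)}
  &&: \; -1.6257 < -0.5148 < -0.0410 < -0.0185 < 0.
\end{align*}
```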

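Proof. We use mathematical induction on n. The roots of
$S_2(x) = S_3(x) = x$, $S_4(x) = 3x^2 + x$ (roots $\beta_1^{(4)} = -1/3$ and
$\beta_2^{(4)} = 0$) and $S_5(x) = 10x^2 + x$ (roots
$\alpha_1^{(5)} = -1/10$ and $\alpha_2^{(5)} = 0$) satisfy Lemma 4.3. The
inductive step follows from the following two statements for n > 2:

(i) If the roots of $S_{2n-2}(x)$ and $S_{2n-1}(x)$ occur with multiplicity
one and satisfy

$$\beta_1^{(2n-2)} < \alpha_1^{(2n-1)} < \beta_2^{(2n-2)} < \alpha_2^{(2n-1)} < \dots < \alpha_{n-2}^{(2n-1)} < \beta_{n-1}^{(2n-2)} = 0 = \alpha_{n-1}^{(2n-1)},$$

then the roots $\beta_i^{(2n)}$ of $S_{2n}(x)$ satisfy

$$\beta_1^{(2n)} < \alpha_1^{(2n-1)} < \beta_2^{(2n)} < \alpha_2^{(2n-1)} < \dots < \beta_{n-1}^{(2n)} < \alpha_{n-1}^{(2n-1)} = 0 = \beta_n^{(2n)}.$$

(ii) If the roots of $S_{2n-1}(x)$ and $S_{2n}(x)$ occur with multiplicity
one and satisfy

$$\beta_1^{(2n)} < \alpha_1^{(2n-1)} < \beta_2^{(2n)} < \alpha_2^{(2n-1)} < \dots < \beta_{n-1}^{(2n)} < \alpha_{n-1}^{(2n-1)} = 0 = \beta_n^{(2n)},$$

then the roots $\alpha_i^{(2n+1)}$ of $S_{2n+1}(x)$ satisfy

$$\beta_1^{(2n)} < \alpha_1^{(2n+1)} < \beta_2^{(2n)} < \alpha_2^{(2n+1)} < \dots < \alpha_{n-2}^{(2n+1)} < \beta_{n-1}^{(2n)} < \alpha_{n-1}^{(2n+1)} < \beta_n^{(2n)} = 0 = \alpha_n^{(2n+1)}.$$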

First we prove (i). In our setting the identity (26) specifies to

$$S_{2n}(x)/x = (2n-1)S_{2n-2}(x) + S_{2n-1}'(x), \tag{28}$$

where the RHS is the sum of two polynomials of degree n - 1 and n - 2,
respectively.

Set $\alpha_0^{(2n-1)} = -\infty$. The proof hinges on the following three
claims:

• The sign of $S_{2n-2}(x)$ alternates on $\alpha_i^{(2n-1)}$,
$\alpha_{i+1}^{(2n-1)}$ for $i = 0, 1, \dots, n-3$;

• The sign of $S_{2n-1}'(x)$ alternates on $\alpha_i^{(2n-1)}$,
$\alpha_{i+1}^{(2n-1)}$ for $i = 1, \dots, n-2$; and

• $\mathrm{sign}(S_{2n-2}(\alpha_1^{(2n-1)})) = \mathrm{sign}(S_{2n-1}'(\alpha_1^{(2n-1)}))$.

The first claim follows from the hypotheses.

The second claim follows from the fact that $S_{2n-1}'(x)$ is a polynomial
of degree n - 2 and it has exactly one root in every interval
$(\alpha_i^{(2n-1)}, \alpha_{i+1}^{(2n-1)})$ for $i = 1, 2, \dots, n-2$, as
it must have a root between consecutive roots of $S_{2n-1}(x)$.

The third claim follows from the facts that
$\mathrm{sign}(S_{2n-2}(\alpha_1^{(2n-1)})) = -\mathrm{sign}(S_{2n-2}(-\infty)) = -(-1)^{n-1}$,
as $S_{2n-2}(x)$ has a single root, $\beta_1^{(2n-2)}$, which is less than
$\alpha_1^{(2n-1)}$; and
$\mathrm{sign}(S_{2n-1}'(\alpha_1^{(2n-1)})) = \mathrm{sign}(S_{2n-1}'(-\infty)) = (-1)^{n-2}$,
as $S_{2n-1}'(x)$ has no root less than $\alpha_1^{(2n-1)}$.

From the three claims and (28) it follows that the sign of $S_{2n}(x)/x$,
and hence of $S_{2n}(x)$, alternates on $\alpha_i^{(2n-1)}$,
$\alpha_{i+1}^{(2n-1)}$ for $i = 1, \dots, n-3$; and this fact provides the
required root $\beta_{i+1}^{(2n)}$ between these numbers,
$i = 1, \dots, n-3$.

From the proof of the third claim and (28) it follows that
$\mathrm{sign}(S_{2n}(\alpha_1^{(2n-1)})/\alpha_1^{(2n-1)}) = (-1)^n$. If we
show that $S_{2n}(x)/x$ has a different sign at $-\infty$, then we have
provided the required root $\beta_1^{(2n)} < \alpha_1^{(2n-1)}$ for
$S_{2n}(x)/x$, and hence for $S_{2n}(x)$. Indeed, the degree of
$S_{2n-2}(x)$ is greater than the degree of $S_{2n-1}'(x)$, and therefore
the sign of $S_{2n}(x)/x$ at $-\infty$ is the sign of $S_{2n-2}(x)$ at
$-\infty$, namely $(-1)^{n-1}$.

As $S_{2n-2}(\alpha_{n-1}^{(2n-1)}) = S_{2n-2}(0) = 0$, the second and the
third claim, and (28) imply that $S_{2n}(x)/x$ alternates on
$\alpha_{n-2}^{(2n-1)}$, $\alpha_{n-1}^{(2n-1)}$, providing the required root
$\beta_{n-1}^{(2n)}$ between these numbers, also for $S_{2n}(x)$. Finally,
the last root to find is $\beta_n^{(2n)} = 0$.

Next we prove (ii). In our setting the identity (26) specifies to

$$S_{2n+1}(x)/x = 2nS_{2n-1}(x) + S_{2n}'(x), \tag{29}$$

where the RHS is the sum of two polynomials of degree n - 1. The proof
hinges on the following three claims:

• The sign of $S_{2n-1}(x)$ alternates on $\beta_i^{(2n)}$,
$\beta_{i+1}^{(2n)}$ for $i = 1, \dots, n-2$;

• The sign of $S_{2n}'(x)$ alternates on $\beta_i^{(2n)}$,
$\beta_{i+1}^{(2n)}$ for $i = 1, \dots, n-1$; and

• $\mathrm{sign}(S_{2n-1}(\beta_1^{(2n)})) = \mathrm{sign}(S_{2n}'(\beta_1^{(2n)}))$.

The first claim follows from the hypotheses.

The second claim follows from the fact that $S_{2n}'(x)$ is a polynomial of
degree n - 1 and it has exactly one root in every interval
$(\beta_i^{(2n)}, \beta_{i+1}^{(2n)})$ for $i = 1, 2, \dots, n-1$, as it must
have a root between consecutive roots of $S_{2n}(x)$.

The third claim follows from the facts that
$\mathrm{sign}(S_{2n-1}(\beta_1^{(2n)})) = \mathrm{sign}(S_{2n-1}(-\infty)) = (-1)^{n-1}$
and
$\mathrm{sign}(S_{2n}'(\beta_1^{(2n)})) = \mathrm{sign}(S_{2n}'(-\infty)) = (-1)^{n-1}$,
as neither $S_{2n}'(x)$ nor $S_{2n-1}(x)$ has a root less than
$\beta_1^{(2n)}$.

From the three claims and (29) it follows that the sign of $S_{2n+1}(x)/x$,
and hence of $S_{2n+1}(x)$, alternates on $\beta_i^{(2n)}$,
$\beta_{i+1}^{(2n)}$ for $i = 1, \dots, n-2$; and this fact provides the
required root $\alpha_i^{(2n+1)}$ between these numbers,
$i = 1, \dots, n-2$.

As $S_{2n-1}(\beta_n^{(2n)}) = S_{2n-1}(0) = 0$, the second and the third
claim, and (29) imply that $S_{2n+1}(x)/x$ alternates on
$\beta_{n-1}^{(2n)}$, $\beta_n^{(2n)}$, providing the required root
$\alpha_{n-1}^{(2n+1)}$ between these numbers, also for $S_{2n+1}(x)$.
Finally, the last root to find is $\alpha_n^{(2n+1)} = 0$. □



5 Phylogenetic trees and set partitions in another 
distribution 

Theorem 5.1. For the array $A(n,j) = T_{n+1,j}$, the central limit theorem
(4) and the local limit theorem (8) hold with

$$\mathcal{E}(T_{n+1,\cdot}) = \frac{1-\rho}{2\rho}\,n + \frac{3/4 - \ln 2}{\rho} + O(1/n)$$

and

$$\mathcal{D}^2(T_{n+1,\cdot}) = \frac{n}{4}\left(\frac{1}{\rho^2} - \frac{2}{\rho} - 1\right) + \frac{1 + 4\ln 2 - 8\ln^2 2}{8\rho^2} + O(1/n),$$

where $\rho = -1 + 2\ln 2$. Furthermore, the number $k = J_n$ that maximizes
$T_{n+1,k}$ satisfies

$$J_n = \frac{1-\rho}{2\rho}\,n + o(\sqrt{n}), \tag{30}$$

and

n!(l + o(l)) 

J-n+l,J n - , i / = r =- (31J 

^np"+2./(4-|-l) 



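Identity (2) immediately implies the following central and local limit
theorems as corollaries:

$$\frac{1}{t_{n+1}} \sum_{j \le x_n} S^*(n+j, j) \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt \tag{32}$$

and

$$\frac{\mathcal{D}(T_{n+1,\cdot})}{t_{n+1}}\, S^*(n + \lfloor x_n \rfloor, \lfloor x_n \rfloor) \to \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \tag{33}$$

as $n \to \infty$, uniformly in x, with
$x_n = \mathcal{E}(T_{n+1,\cdot}) + x\,\mathcal{D}(T_{n+1,\cdot})$.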

Proof of Theorem 5.1. Felsenstein [10, 11] proved the recurrence relation²

$$T_{n,k} = (n + k - 2)T_{n-1,k-1} + kT_{n-1,k} \tag{34}$$

for k > 1 with the initial condition $T_{n,1} = 1$ for n > 1. Consider the
polynomials $P_n(x) = \sum_k T_{n+1,k}x^k$. Then $P_n(1) = t_{n+1}$ and the
degree of $P_n(x)$ is n. Felsenstein's recurrence relation (34) implies the
identity

$$P_n(x) = nxP_{n-1}(x) + (x + x^2)P_{n-1}'(x) \tag{35}$$

with initial terms $P_0(x) = 1$, $P_1(x) = T_{2,1}x = x$. We have for the
expectation and variance, from (3), using (35) repeatedly,

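$$\mathcal{E}(T_{n+1,\cdot}) = \frac{t_{n+2}}{2t_{n+1}} - \frac{n+1}{2}, \tag{36}$$

$$\mathcal{D}^2(T_{n+1,\cdot}) = \frac{t_{n+3}}{4t_{n+1}} - \frac{t_{n+2}^2}{4t_{n+1}^2} - \frac{t_{n+2}}{2t_{n+1}} - \frac{n+1}{4}. \tag{37}$$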
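The expectation $t_{n+2}/(2t_{n+1}) - (n+1)/2$ and the variance $t_{n+3}/(4t_{n+1}) - t_{n+2}^2/(4t_{n+1}^2) - t_{n+2}/(2t_{n+1}) - (n+1)/4$ can be verified against a direct computation from the coefficients $T_{n+1,k}$. The sketch below is an illustration with our own function names; it builds $P_n(x)$ from $P_0(x) = 1$ through the coefficient form $T_{n+1,k} = (n+k-1)T_{n,k-1} + kT_{n,k}$ of Felsenstein's recurrence.

```python
from fractions import Fraction


def felsenstein_poly(n):
    """Coefficients of P_n(x) = sum_k T_{n+1,k} x^k, starting from P_0(x) = 1."""
    p = [1]
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            prev_k = p[k] if k < len(p) else 0
            new[k] = (m + k - 1) * p[k - 1] + k * prev_k
        p = new
    return p


def t(n):
    """t_n = P_{n-1}(1), the number of phylogenetic trees with n labeled leaves."""
    return sum(felsenstein_poly(n - 1))


def moments_from_t(n):
    """Expectation and variance of T_{n+1,.} expressed through t_{n+1..n+3}."""
    a = Fraction(t(n + 2), t(n + 1))
    b = Fraction(t(n + 3), t(n + 1))
    mean = a / 2 - Fraction(n + 1, 2)
    variance = b / 4 - a * a / 4 - a / 2 - Fraction(n + 1, 4)
    return mean, variance


def moments_direct(n):
    """Expectation and variance of T_{n+1,.} from the coefficients of P_n."""
    p = felsenstein_poly(n)
    total = sum(p)
    mean = Fraction(sum(k * c for k, c in enumerate(p)), total)
    second = Fraction(sum(k * k * c for k, c in enumerate(p)), total)
    return mean, second - mean * mean
```

The first values of $t_n$ produced this way are 1, 1, 4, 26, 236, in agreement with A000311.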

Consider the following bivariate generating function for $T_{n,k}$:

$$H(x,z) = \sum_{n \ge 1} \sum_k T_{n,k}\, x^k \frac{z^n}{n!} = \sum_{n \ge 1} P_{n-1}(x) \frac{z^n}{n!}, \tag{38}$$



² The recurrence is based on a case analysis whether the nth leaf is to be
grafted into an edge or to be joined to an internal vertex of an already
existing tree with n - 1 leaves.
in particular,
$H(1,z) = \frac{z}{1!} + \frac{z^2}{2!} + 4\frac{z^3}{3!} + 26\frac{z^4}{4!} + \dots$.
Flajolet [9] observed the functional equation

$$H(x,z) = z + x\left(e^{H(x,z)} - 1 - H(x,z)\right), \tag{39}$$

which immediately follows from the Exponential Formula, and obtained from
this equation an expression for H(1, z) in terms of the Lambert function:

$$H(1,z) = -\mathrm{LambertW}\left(-\tfrac{1}{2}\, e^{(z-1)/2}\right) + \frac{z-1}{2}.$$
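For the reader's convenience, the Lambert form can be recovered from the functional equation $H = z + x(e^H - 1 - H)$ at x = 1 in a few lines (a routine manipulation, written out here as an illustration; W denotes the Lambert function, defined by $W(y)e^{W(y)} = y$):

```latex
\begin{align*}
H = z + e^{H} - 1 - H
  &\iff 2H - z + 1 = e^{H},\\
\text{put } u = H - \tfrac{z-1}{2}:\qquad
2u = e^{u}\, e^{(z-1)/2}
  &\iff (-u)\,e^{-u} = -\tfrac12\, e^{(z-1)/2},\\
-u = W\!\left(-\tfrac12\, e^{(z-1)/2}\right)
  &\iff H(1,z) = \frac{z-1}{2} - W\!\left(-\tfrac12\, e^{(z-1)/2}\right).
\end{align*}
```

The branch point of W at $-1/e$ corresponds to $e^{(z-1)/2} = 2/e$, that is, to $z = 2\ln 2 - 1$.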



He also observed that H(1, z), the EGF of the $t_n$ sequence, has a
singularity at $\rho = -1 + 2\ln 2$, and it is the only singularity at this
radius; and furthermore, for $|z| < \rho$, there is a singular expansion of
H(1, z) in terms of $\Delta = \sqrt{1 - z/\rho}$, of which the first few
terms are

$$H(1,z) = \ln 2 - \sqrt{\rho}\,\Delta + \left(\frac{1}{6} - \frac{1}{3}\ln 2\right)\Delta^2 - \frac{\rho^{3/2}}{36}\,\Delta^3 + O(\Delta^4). \tag{40}$$

Flajolet [9] used (40) to obtain an asymptotic formula for $t_n$ and noted
that a full asymptotic expansion can be obtained by this method. Using
Maple, we went further and obtained the following asymptotic expansion:

n! / 1 3 25 / 1 \\ 

tn ~ ^n-l ^^7^ + 16^57^ + 256n 7 /2 + °l^J J • (41) 



Combining (36) and (37) with (41), one obtains the asymptotics for the
expectation and the variance in Theorem 5.1. The details are on a Maple
worksheet [20].
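Independently of the Maple worksheet, the linear asymptotics for the expectation, $\frac{1-\rho}{2\rho}n + \frac{3/4 - \ln 2}{\rho}$, and for the variance, $\frac{n}{4}(\rho^{-2} - 2\rho^{-1} - 1) + \frac{1 + 4\ln 2 - 8\ln^2 2}{8\rho^2}$, with $\rho = 2\ln 2 - 1$, can be compared with exact values. The following sketch uses names of our own choosing and computes the exact moments directly from the coefficients of $P_n(x)$.

```python
import math
from fractions import Fraction

RHO = 2 * math.log(2) - 1


def felsenstein_poly(n):
    """Coefficients of P_n(x) = sum_k T_{n+1,k} x^k, starting from P_0(x) = 1."""
    p = [1]
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            prev_k = p[k] if k < len(p) else 0
            new[k] = (m + k - 1) * p[k - 1] + k * prev_k
        p = new
    return p


def exact_moments(n):
    """Exact expectation and variance of T_{n+1,.}, returned as floats."""
    p = felsenstein_poly(n)
    total = sum(p)
    mean = Fraction(sum(k * c for k, c in enumerate(p)), total)
    second = Fraction(sum(k * k * c for k, c in enumerate(p)), total)
    return float(mean), float(second - mean * mean)


def asymptotic_moments(n):
    """Linear-in-n approximations of the expectation and variance."""
    ln2 = math.log(2)
    mean = (1 - RHO) / (2 * RHO) * n + (0.75 - ln2) / RHO
    variance = (
        n / 4 * (1 / RHO ** 2 - 2 / RHO - 1)
        + (1 + 4 * ln2 - 8 * ln2 ** 2) / (8 * RHO ** 2)
    )
    return mean, variance
```

Evaluating both functions at the same n shows the gap between exact and approximate values shrinking as n grows, consistent with an O(1/n) error term.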



Lemma 5.2. For $n \ge 1$, the polynomial $P_n(x)$ has n distinct real roots,
one of them is zero, and the other n - 1 roots are in the open interval
(-1, 0).

Proof. We prove the lemma by mathematical induction on n. The small cases
above are easy to verify. It is easy to see (by a different induction) that
$P_1(-1) = -1$ and from (35), $P_n(-1) = (-n)P_{n-1}(-1)$, thus

$$\mathrm{sign}(P_n(-1)) = (-1)^n. \tag{42}$$

Using the induction hypothesis, let the roots of $P_n(x)$ be

$$-1 < \alpha_1 < \dots < \alpha_{n-2} < \alpha_{n-1} < \alpha_n = 0.$$

By Rolle's theorem, $P_n'(x)$ has a root $\beta_i$ in
$(\alpha_i, \alpha_{i+1})$ for $i = 1, 2, \dots, n-1$. From (35), observe
that $\mathrm{sign}(P_{n+1}(\beta_i)) = -\mathrm{sign}(P_n(\beta_i))$. As
the sign of $P_n(x)$ must alternate on the $\beta_i$, so does the sign of
$P_{n+1}(x)$, and therefore $P_{n+1}(x)$ has a root in
$(\beta_i, \beta_{i+1})$ for $i = 1, 2, \dots, n-2$. We have to find 3 more
roots: one is x = 0, and we will show that the other two are in the
intervals $(-1, \beta_1)$ and $(\beta_{n-1}, 0)$, respectively.

Indeed, $\mathrm{sign}(P_n(x))$ differs in -1 and $\beta_1$, since $P_n(x)$
has a single root $\alpha_1$ between them. Also,
$\mathrm{sign}(P_{n+1}(-1)) = -\mathrm{sign}(P_n(-1))$ by (42) and
$\mathrm{sign}(P_{n+1}(\beta_1)) = -\mathrm{sign}(P_n(\beta_1))$ from our
earlier observation. Hence, $\mathrm{sign}(P_{n+1}(x))$ differs in -1 and
$\beta_1$, and therefore $P_{n+1}(x)$ has a root in $(-1, \beta_1)$.

Observe that (35) with induction implies that for $n \ge 1$ the coefficient
of x in $P_n(x)$ is positive. On one hand, we have that for x < 0 but x
sufficiently close to zero, $\mathrm{sign}(P_{n+1}(x)) = -1$. On the other
hand, $\mathrm{sign}(P_{n+1}(\beta_1)) = -\mathrm{sign}(P_{n+1}(-1)) = (-1)^n$,
$\mathrm{sign}(P_{n+1}(\beta_i)) = (-1)^{n+i-1}$, and
$\mathrm{sign}(P_{n+1}(\beta_{n-1})) = 1$. Therefore $P_{n+1}(x)$ has a root
in $(\beta_{n-1}, 0)$. □
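Lemma 5.2 can be checked for small n with exact arithmetic. The sketch below counts the roots of $P_n(x)/x$ in (-1, 0) by Sturm's theorem, a standard root-counting tool that plays no role in the proof above; all function names are ours and the code is meant only as an illustration.

```python
from fractions import Fraction


def felsenstein_poly(n):
    """Coefficients of P_n(x) = sum_k T_{n+1,k} x^k, starting from P_0(x) = 1."""
    p = [1]
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            prev_k = p[k] if k < len(p) else 0
            new[k] = (m + k - 1) * p[k - 1] + k * prev_k
        p = new
    return p


def poly_eval(p, x):
    """Evaluate a polynomial given by coefficients p (lowest degree first)."""
    result = Fraction(0)
    for c in reversed(p):
        result = result * x + c
    return result


def poly_rem(a, b):
    """Remainder of a divided by b (coefficient lists, lowest degree first)."""
    a = a[:]
    while len(a) >= len(b) and any(a):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a.pop()  # the leading coefficient is now zero
    while a and a[-1] == 0:
        a.pop()
    return a


def sturm_chain(p):
    """Sturm chain p, p', -rem(p, p'), ... with rational coefficients."""
    p = [Fraction(c) for c in p]
    deriv = [i * c for i, c in enumerate(p)][1:]
    chain = [p, deriv]
    while len(chain[-1]) > 1:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            break
        chain.append([-c for c in r])
    return chain


def sign_changes(chain, x):
    """Number of sign changes in the chain evaluated at x, ignoring zeros."""
    values = [v for v in (poly_eval(q, x) for q in chain) if v != 0]
    return sum(1 for u, v in zip(values, values[1:]) if (u > 0) != (v > 0))


def count_roots_between(p, a, b):
    """Number of distinct real roots of p in (a, b), assuming p(a), p(b) != 0."""
    chain = sturm_chain(p)
    return sign_changes(chain, Fraction(a)) - sign_changes(chain, Fraction(b))
```

For each n, dropping the zero constant term of `felsenstein_poly(n)` gives $P_n(x)/x$, and `count_roots_between(..., -1, 0)` should return n - 1, the full degree of that quotient.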